Comparing methods for automatic acquisition of Topic Signatures

نویسندگان

  • Montse Cuadros
  • Lluis Padro
  • German Rigau
چکیده

The main goal of this work is to compare two methods for building Topic Signatures, which are vectors of weighted words acquired from large corpora. We used two different software tools, ExRetriever and Infomap, for acquiring Topic Signatures from corpus. Using these tools, we retrieve sense examples from large text collections. Both systems construct a query for each word sense using WordNet. The quality of the acquired Topic Signatures is indirectly evaluated on the Word Sense Disambiguation English Lexical Task of Senseval-2. keywords: Topic Signatures, acquisition, Latent Semantic Indexing, Word Sense Disambiguation, Multilingual Central Repository, WordNet.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The Automated Acquisition of Topic Signatures for Text Summarization

In order to produce a good summary, one has to identify the most relevant portions of a given text. We describe in this paper a method for automatically training topic signatures{sets of related words, with associated weights, organized around head topics{and illustrate with signatures we created with 6,194 TREC collection texts over 4 selected topics. We describe the possible integration of to...

متن کامل

تولید خودکار الگوهای نفوذ جدید با استفاده از طبقه‌بندهای تک کلاسی و روش‌های یادگیری استقرایی

In this paper, we propose an approach for automatic generation of novel intrusion signatures. This approach can be used in the signature-based Network Intrusion Detection Systems (NIDSs) and for the automation of the process of intrusion detection in these systems. In the proposed approach, first, by using several one-class classifiers, the profile of the normal network traffic is established. ...

متن کامل

Automatic Acquisition of English Topic Signatures Based on a Second Language

We present a novel approach for automatically acquiring English topic signatures. Given a particular concept, or word sense, a topic signature is a set of words that tend to co-occur with it. Topic signatures can be useful in a number of Natural Language Processing (NLP) applications, such as Word Sense Disambiguation (WSD) and Text Summarisation. Our method takes advantage of the different way...

متن کامل

An Empirical Study for the Automatic Acquisition of Topic Signatures

The main goal of this work is to compare different methods for building Topic Signatures, which are vectors of weighted words acquired from large corpora. We used two different software tools, ExRetriever [Cuadros et al., 2004] and Infomap [Dorow and Widdows, 2003], for acquiring Topic Signatures from corpus. Using these tools, we retrieve sense examples from large text collections. We also inc...

متن کامل

Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation

Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005